Background read-only region creation by jmpesp · Pull Request #1919 · oxidecomputer/crucible

jmpesp · 2026-04-09T01:10:14Z

When the Crucible Agent is asked to create a read-only region from a remote Downstairs source, the work currently blocks the worker thread: region creation is performed in the worker loop, so the worker cannot respond to other state changes while it runs.

This commit spawns region creation threads that the main worker thread can send requests to, and sends all read-only region creation requests there.

This builds on the previous work to separate the serialized on-disk types from the in-memory types: a Creating state is added to the in-memory type and used while this background creation is occurring.
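The hand-off described above can be sketched roughly as follows. This is a minimal illustration, not the agent's code: the `State` variants other than `Creating`, the `CreateRequest` type, the `clone_region` stand-in, and the use of `std::sync::mpsc` are all assumptions made for the example.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// In-memory region state. Only `Creating` is named in the PR description;
/// the other variants are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Requested,
    Creating,
    Created,
    Failed,
}

/// Hypothetical request sent from the worker loop to a creation thread.
struct CreateRequest {
    id: String,
}

type Regions = Arc<Mutex<HashMap<String, State>>>;

/// Stand-in for the slow clone from a remote Downstairs source.
fn clone_region(_id: &str) -> Result<(), String> {
    Ok(())
}

/// Spawn one background creation thread and return the sending half of its channel.
fn spawn_creator(regions: Regions) -> (mpsc::Sender<CreateRequest>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<CreateRequest>();
    let handle = thread::spawn(move || {
        // The loop ends once every Sender has been dropped.
        for req in rx {
            let next = match clone_region(&req.id) {
                Ok(()) => State::Created,
                Err(_) => State::Failed,
            };
            regions.lock().unwrap().insert(req.id, next);
        }
    });
    (tx, handle)
}

fn run_demo() -> State {
    let regions: Regions = Arc::new(Mutex::new(HashMap::new()));
    regions.lock().unwrap().insert("r1".to_string(), State::Requested);
    let (tx, handle) = spawn_creator(Arc::clone(&regions));

    // Worker loop: record `Creating`, hand the request off, and move on
    // to other state changes instead of blocking on the clone.
    regions.lock().unwrap().insert("r1".to_string(), State::Creating);
    tx.send(CreateRequest { id: "r1".to_string() })
        .expect("creation thread is gone");

    // For the demo only: close the channel and wait for the thread to finish.
    drop(tx);
    handle.join().unwrap();
    let state = *regions.lock().unwrap().get("r1").unwrap();
    state
}

fn main() {
    println!("final state: {:?}", run_demo());
}
```

Because `Creating` lives only on the in-memory type in this sketch, a restart during a clone would not find that state persisted on disk, which is consistent with the split between serialized and in-memory types that the description mentions.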

leftwo

Thanks for the work here, I have some questions for you.

leftwo · 2026-04-10T21:41:45Z

agent/src/main.rs

            let log0 = log.new(o!("component" => "worker"));
            let df0 = Arc::clone(&df);
-            std::thread::spawn(|| {
+            tokio::spawn(async {


If we are going from a real thread to a tokio task, could a long-running region creation trip us up here? With the old approach, the thread could go off and do whatever it needed for an hour while the rest of the agent continued working. Do we run any risk of losing that here?

I'm not sure about all the differences between threads and tasks, but I don't think there's a risk. Whether the worker runs in a thread or in a task, read/write region creation occurs separately from the dropshot server and the datafile manipulation logic.

leftwo · 2026-04-10T21:47:19Z

agent/src/main.rs

-                                );
-                                df.fail(&r.id);
-                                break 'requested;
+                                std::process::exit(1);


What happens to the agent if we fail like this? Is it going to crash and restart?

I'm a little concerned that if we fail here, we restart the whole agent. If it's a persistent failure, do we get ourselves into a crash loop?

Looking at the error, though, it's a failure to send the request over the channel to one of our worker threads, and that should be a difficult situation to reach, correct? And if we do see it, a restart of the process is likely to behave differently? I just want to avoid setting ourselves up for a crash loop.
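To make the failure mode under discussion concrete: with a standard-library channel, `send` returns an error only after the receiving side has been dropped, for example because the receiving thread exited or panicked. The sketch below assumes `std::sync::mpsc` (the channel type the agent actually uses is not shown in this thread) and uses a hypothetical `dispatch` helper. It shows the "mark failed and continue" handling that the removed lines performed, as a contrast to exiting the process.

```rust
use std::sync::mpsc;
use std::thread;

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Dispatched,
    MarkedFailed,
}

/// Hypothetical helper: try to hand a region id to a creation thread.
/// On failure, report it and let the caller mark the region failed
/// instead of terminating the process.
fn dispatch(tx: &mpsc::Sender<String>, id: &str) -> Outcome {
    match tx.send(id.to_string()) {
        Ok(()) => Outcome::Dispatched,
        // SendError hands back the value that could not be delivered.
        Err(mpsc::SendError(unsent)) => {
            eprintln!("creation thread is gone, could not send {unsent}");
            Outcome::MarkedFailed
        }
    }
}

fn demo() -> (Outcome, Outcome) {
    let (tx, rx) = mpsc::channel::<String>();

    // The receiver is still alive, so this send succeeds.
    let first = dispatch(&tx, "r1");

    // Simulate the creation thread exiting: the receiver is dropped with it.
    thread::spawn(move || drop(rx)).join().unwrap();

    // Every later send now fails until a new channel is created.
    let second = dispatch(&tx, "r2");
    (first, second)
}

fn main() {
    println!("{:?}", demo());
}
```

Under this assumption the error is persistent for the life of the process, since a dropped receiver never comes back, but a fresh process starts with fresh threads and channels. That is the distinction the crash-loop question turns on.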

jmpesp requested a review from leftwo April 9, 2026 01:10

leftwo reviewed Apr 10, 2026

jmpesp added 2 commits April 13, 2026 14:31

change confusing error message

bacd702

drop references to the old agent

d482c4a

jmpesp wants to merge 3 commits into oxidecomputer:main from jmpesp:concurrent_read_only_clone
